Forensic Science International: Genetics
○ Elsevier BV
Preprints posted in the last 90 days, ranked by how well they match Forensic Science International: Genetics's content profile, based on 24 papers previously published here. The average preprint has a 0.02% match score for this journal, so anything above that is already an above-average fit.
Navarro Vera, I.; Bonilla, A.; Tirapu, M.; Albert, M.; Jimenez, P. P.; Herranz-Rodrigo, D.; Cruz-Alcazar, R.; Garcia, C.; Yravedra Sainz de los Terreros, J.
The geographical and familial origins of Christopher Columbus have remained a subject of intense historiographical debate for over five centuries. Despite numerous hypotheses, empirical genetic evidence capable of resolving his ancestral history or place of birth has been absent from the literature until now. This study presents the third stage of the first forensic genetic analysis performed on skeletal remains belonging to several direct descendants of Columbus, spanning the 16th to 18th centuries. By applying Massively Parallel Sequencing (MPS) to analyse autosomal, X- and Y-chromosome DNA markers, and integrating the results with multidisciplinary evidence from the historical, genealogical, archaeological, and anthropological research involved in this project, the identification of several individuals found in the Crypt of Santa Maria de Gracia located in Gelves (Sevilla, Spain) has been achieved. The analysis of their biological relatedness enabled the reconstruction of kinship networks among the individuals interred in the crypt, which, when interpreted in the context of documented genealogical lineages, provides indirect but consistent evidence pointing toward the debated origin of the discoverer.
Vol, E.; Waldman, S.; Lomes, A.; Brielle, E. S.; Appel, N.; Dolin, B.; Asif, S.; Nagar, Y.; Marco, E.; Bergman, N.; Khaner, O.; Raviv, D.; Oliel, J.; Lewis, R. Y.; Carmi, S.
Genome-wide technologies can generate investigative leads in cold cases by determining the genetic ancestry of the forensic sample. Increasingly, DNA extraction and whole-genome sequencing or genotyping are being used to analyze early or middle-20th-century skeletal remains. Here, we present the first case, to our knowledge, of whole-genome sequencing of a middle-20th-century bone sample from the Middle East. A femur discovered in a cave in Central Israel was proposed to belong to a person of Ashkenazi Jewish ancestry who had been missing since 1948. Following DNA extraction and single-stranded library preparation, whole-genome sequencing generated nearly 500 million reads. However, only 0.5% of the reads mapped to the human genome, providing depth of coverage of 0.07x. After quality control and male sex inference, ancestry assignment was performed using principal components and ADMIXTURE analyses. The results suggested that the genome definitively belonged to a person of Arab ancestry, refuting the hypothesis of an Ashkenazi Jewish origin.
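The read counts quoted in this abstract can be cross-checked with simple arithmetic. The sketch below is only a consistency check, not the authors' pipeline: it rounds "nearly 500 million" to 500 million, assumes a haploid human genome size of about 3.1 Gb (a value not given in the abstract), and ignores any reads removed during quality control.

```python
# Values quoted in the abstract ("nearly 500 million" is rounded to 500 million here).
TOTAL_READS = 500e6
MAPPED_FRACTION = 0.005   # 0.5% of reads mapped to the human genome
REPORTED_DEPTH = 0.07     # reported depth of coverage (x)

# Assumption: haploid human genome size of about 3.1 Gb (not stated in the abstract).
GENOME_SIZE_BP = 3.1e9


def mapped_read_count(total_reads: float, mapped_fraction: float) -> float:
    """Number of reads that mapped to the reference."""
    return total_reads * mapped_fraction


def implied_mean_read_length(depth: float, mapped_reads: float, genome_size_bp: float) -> float:
    """Solve depth = mapped_reads * mean_length / genome_size for mean_length."""
    return depth * genome_size_bp / mapped_reads


mapped = mapped_read_count(TOTAL_READS, MAPPED_FRACTION)
mean_len = implied_mean_read_length(REPORTED_DEPTH, mapped, GENOME_SIZE_BP)
print(f"{mapped:,.0f} mapped reads, implied mean mapped length of about {mean_len:.0f} bp")
```

Under these assumptions, roughly 2.5 million mapped reads at 0.07x correspond to a mean mapped length of a little under 90 bp. The abstract does not report fragment lengths, so this figure is an inference from the rounded numbers only.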
Gill, M. U.; Akhtar, M.
Due to the limited availability of reliable and well-validated molecular markers, the determination of postmortem interval (PMI) is still a major obstacle for forensic investigators trying to resolve a case. The largest human protein, known as titin, has never been examined at the domain level for postmortem degradation patterns. This study focused on the in silico analysis of the immunoglobulin-like (Ig-like), fibronectin type III (Fn-III), and protein kinase domains of human titin to assess their potential utility in PMI estimation. Sequence data for the studied domains were retrieved from UniProt, and 2D and 3D models were generated with PSIPRED and SWISS-MODEL, respectively, followed by analysis of physicochemical properties, solubility assessment, and structural comparison. This study revealed that the Ig-like domain is the most stable, followed by the Fn-III and protein kinase domains. These findings indicate that titin domains may degrade at different rates in the postmortem period. This study introduces the first computational basis for considering titin as a multi-domain candidate biomarker for PMI estimation, laying the groundwork for upcoming laboratory validation.
Akane, O.; Kawaguchi, Y. W.; Niwa, T.; Uno, Y.; Kuraku, S.
The effective management of threatened shark populations relies on accurate demographic data, particularly operational sex ratios. While sex identification in intact shark bodies is straightforward through the presence of external male organs, namely claspers, it remains impossible for processed fins in the illegal wildlife trade, early-stage embryos in breeding programs, or archived tissue fragments and blood samples where morphological traits are lost. Here, we present a robust molecular sexing framework leveraging recently identified sequences from shark sex chromosomes, consistently organized in the XY system, to our current knowledge. Our approach consists of two distinct methodologies tailored to the current identification status of sex chromosome sequences in the target species. For the whale shark Rhincodon typus and the brownbanded bamboo shark Chiloscyllium punctatum, we employed end-point PCR assays targeting male-specific Y-linked markers. For the cloudy catshark Scyliorhinus torazame, we developed a quantitative PCR (qPCR) assay targeting differential X chromosome dosage. In this dosage-based system, females (XX) are distinguished by an amplification profile approximately one cycle earlier than males (XY). By integrating X-linked dosage quantification, our framework provides a critical internal control that significantly enhances reliability, allowing researchers to distinguish true females from PCR failures. This toolkit offers a versatile solution for diverse applications, ranging from the study of sex determination mechanisms in pre-phenotypic embryos to the reconstruction of sex ratios from space-constrained tissue archives and global wildlife forensics, thereby contributing to the comprehensive conservation of shark biodiversity.
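The "approximately one cycle earlier" figure follows from the doubling of template in each PCR cycle: a twofold dosage difference shifts the threshold cycle (Ct) by log2(2) = 1 under ideal efficiency. The sketch below illustrates that reasoning only. The autosomal reference Ct, the male calibrator, the tolerance, and the function names are all assumptions made for illustration and are not taken from the abstract.

```python
import math
from typing import Optional


def expected_ct_shift(dosage_ratio: float, efficiency: float = 1.0) -> float:
    """Cycles by which a template at `dosage_ratio`-fold higher dose amplifies earlier.

    efficiency = 1.0 means perfect doubling per cycle.
    """
    return math.log(dosage_ratio) / math.log(1.0 + efficiency)


def call_sex(ct_x: Optional[float], ct_ref: Optional[float],
             male_delta_ct: float, tolerance: float = 0.3) -> str:
    """Classify a sample from its X-linked Ct relative to a reference Ct.

    ct_ref (an autosomal reference) and male_delta_ct (the X-minus-reference
    Ct difference of a known male) are hypothetical inputs for this sketch.
    """
    if ct_x is None or ct_ref is None:
        return "failed"  # no amplification: not evidence of a female
    shift = male_delta_ct - (ct_x - ct_ref)  # cycles earlier than the male calibrator
    if abs(shift) <= tolerance:
        return "male"
    if abs(shift - expected_ct_shift(2.0)) <= tolerance:
        return "female"
    return "ambiguous"


print(call_sex(24.0, 25.0, male_delta_ct=0.0))
print(call_sex(25.0, 25.0, male_delta_ct=0.0))
print(call_sex(None, 25.0, male_delta_ct=0.0))
```

Because the X-linked target amplifies in both sexes, a missing signal is reported as a failure instead of being read as "female", which is the internal-control property the abstract describes.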
Monte, R. E. C.; Magnusson, R.; Söderberg, C.; Green, H.; Elmsjö, A.; Nyman, E.
Subtyping of ketoacidosis, a metabolic state characterized by blood acidification due to various causes, remains challenging in forensic casework. Postmortem omics samples paired with machine learning offer an independent tool to address this challenge. However, such data, especially related to real forensic cases, are rare. In Sweden, high-resolution mass spectrometry data routinely collected in forensic toxicology can be leveraged for metabolomic analysis. Here, we integrate postmortem metabolomics and machine learning models to detect and subtype ketoacidosis-related deaths using real forensic cases in Sweden. From femoral blood samples of 109 alcoholic ketoacidosis cases, 220 diabetic ketoacidosis cases, 140 hypothermia cases, and 1,229 controls (hanging cases), we developed and tested three machine learning models, which achieved over 90% accuracy in ketoacidosis detection and over 80% in subtyping. Validation with independent cohorts (21 starvation cases, 29 alcoholic controls, and 40 diabetic controls) confirmed robustness, with over 80% of starvation cases classified as ketoacidosis-related. Feature clustering highlighted metabolites such as cortisol as important for subtyping. In summary, our findings demonstrate that combining machine learning with postmortem metabolomics enables accurate detection and subtyping of ketoacidosis-related deaths, which is useful for forensic casework.
Shen, Y.; He, K.; Wang, W.; Huang, L.; Chen, J.
In wildlife forensic practice, species identification and estimation of the Minimum Number of Individuals (MNI) for highly processed specimens have long relied on weight-based conversion methods, which may result in underestimation of the number of individuals involved in a case. Focusing on confiscated casque products of the helmeted hornbill (Rhinoplax vigil), this study combines macroscopic morphological examination with mitochondrial DNA barcoding (16S rRNA, COI, and Cytb) to explore a more robust approach for individual quantification. The results demonstrate that the conventional "weight-based" approach overlooks critical biological information contained in anatomical structures and cannot accurately reflect the actual number of individuals involved. Based on this, we propose an anatomy-based criterion centered on the principle of structural uniqueness: specimens retaining biologically unique beak or casque structures should be directly assigned to a single individual, whereas weight-based estimation should only be applied when original anatomical features are entirely absent. In addition, considering material loss during processing, we propose approximately 85 g as a reference threshold for estimating the number of individuals in heavily processed solid casque products. This approach improves the scientific rigor and accuracy of forensic identification and provides reliable technical support for the conviction, sentencing, and law enforcement of wildlife trafficking cases involving helmeted hornbill and other endangered species.
Eulenfeld, T.; Collatz, M.; Braun, S. D.; Ehricht, R.
Introduction: Accurate in silico evaluation of primers and probes is essential for the rational design of molecular multi-parameter assays. We present Assay-BLAST v2 to automate and simplify this process for extensive assay designs. Results: A newly integrated strand and proximity check enables precise validation of corresponding oligonucleotides, ensuring correct orientation and spacing for efficient amplification. Based on predicted oligonucleotide interactions, Assay-BLAST v2 estimates amplification outcomes, offering a computational benchmark for downstream wet-lab validation and performance correlation. Additionally, the updated software integrates an adaptive BLAST parameter optimization that dynamically scales with database size, thereby improving both analytical sensitivity and computational performance. These improvements are supported by a comparative evaluation against the previous version of AssayBLAST. Conclusions: Collectively, these enhancements streamline the assay development workflow, reduce costs associated with suboptimal primer and probe synthesis, and increase the robustness and reliability of molecular diagnostics and research applications.
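A strand and proximity check of the kind described can be pictured as a test on two oligonucleotide hits: they must lie on the same target, on opposite strands, facing each other, and close enough to give a plausible amplicon. The sketch below shows that general idea only. The `Hit` record, the `forms_amplicon` function, and the 1000 bp limit are hypothetical and do not reflect the Assay-BLAST v2 code or interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """Hypothetical record for one oligonucleotide match on a target sequence."""
    contig: str
    start: int   # 0-based, inclusive, in plus-strand coordinates
    end: int     # exclusive
    strand: str  # "+" or "-"


def forms_amplicon(a: Hit, b: Hit, max_len: int = 1000) -> bool:
    """Return True if two primer hits could delimit an amplicon.

    Requires the same contig, opposite strands, the plus-strand hit lying
    upstream of the minus-strand hit, and a total span of at most max_len bp.
    """
    if a.contig != b.contig or a.strand == b.strand:
        return False
    plus, minus = (a, b) if a.strand == "+" else (b, a)
    if plus.end > minus.start:
        return False  # primers overlap or point away from each other
    return (minus.end - plus.start) <= max_len


fwd = Hit("target1", 100, 120, "+")
rev = Hit("target1", 400, 420, "-")
print(forms_amplicon(fwd, rev))
```

The order of the two arguments does not matter here, since the function decides which hit is upstream from the strand labels.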
Butty, V.; Patel, P.; Levine, S. S.
DNA-labelling fluorescent dyes such as ethidium bromide have long been considered to be highly mutagenic during DNA replication. While recent studies have pushed back on this narrative, the intercalative nature of these dyes continues to raise the possibility that they can induce mutations. The iconPCR instrument by n6tec uses fluorescent dyes to measure amplification in real time and to adjust cycling conditions. However, since this use of qPCR is preparative and not analytical, mutations introduced by fluorescent dyes would be propagated into the sequencing reaction. To address the impact of these dyes on downstream analyses, we have performed routine mutation calling as well as mutational signature analysis on samples amplified using the iconPCR in the presence of either SYBR or EvaGreen. Sequence analysis revealed minimal impacts of dyes on the reactions, largely within the noise regime, with only subtle changes in mutation rates seen. Mutational signature analysis was unable to identify any key signatures assignable to the dyes in either the substitution or indel domains. The mutational impact of intercalating dyes during fluorescence-guided amplification is therefore minimal and can be disregarded in all but the most sensitive NGS applications.
Honka, J.; Salazar, D.; Askeyev, A. O.; Askeyev, I. V.; Askeyev, O. V.; Aspi, J.; Asylgaraeva, G. S.; Niskanen, M.; Mannermaa, K.; Olli, S.; Piipponen, N.; Piliciauskiene, G.; Shaymuratova, D. N.; Valiev, R. R.; Kvist, L.
The early evolutionary history of modern domestic horses (Equus caballus/E. ferus caballus), known as the DOM2 lineage, is well documented due to numerous archaeological and ancient DNA (aDNA) studies. Although many uncertainties remain in the domestication timeline, current evidence suggests that the domestication of modern horses began in the Pontic-Caspian steppe by ~2700 BCE (before common era), or even earlier. However, it is not known how long remnant wild horse populations survived or when domestic horses were introduced into Northern Europe. In this study, we review the current knowledge of horse domestication, focusing on Northern Europe. We analysed prehistoric horses from western Russia to assess the body sizes of wild horses from the Ivanovskaya site (5900-3800 BCE) in the Pontic-Caspian steppe, and the body weight of one Lithuanian wild horse (4000-3800 BCE). Additionally, we analysed body sizes of Late Bronze Age-Early Roman Age horses (1100 BCE-300 CE; common era) and re-analysed body sizes and estimated rider weights of historic domestic horses from Lithuania (100-1400 CE). We searched for pathological changes and signs of bit wear indicative of bridling. Furthermore, we investigated maternal genetic diversity by sequencing ancient mitochondrial DNA. We found that wild horses from Ivanovskaya were intermediate in body size between earlier and more recent horses of the Eurasian Steppe, and that the Lithuanian wild horse weighed only ~270 kg and Late Bronze Age-Early Roman Age horses 200-300 kg. Lithuanian domestic horses were pony-sized (< 130 cm on average). Bit wear was confirmed on one tooth, from the oldest domestic horse in Lithuania (799-570 cal BCE). Another tooth showed signs of the Equine Odontoclastic Tooth Resorption and Hypercementosis (EOTRH) condition. Mitochondrial DNA was successfully amplified from one Ivanovskaya wild horse along with 25 other ancient samples, including Lithuania's oldest domestic horse. mtDNA diversity was high, revealing several maternal lineages.
Filipovic-Sadic, S.; Parker, C. A.; Mihailovic, M. K.; Milligan, J. N.; Turner, J. M.; Borel, S. L.; Le, V.; Markulin, T.; Janovsky, J. W.; Killinger, B. J.; Deshotel, M. J.; Reading, N. S.; Fredrickson, E. K.; Ji, Y.; Close, D.; Wright, J.; Williams, M.; Barrie, E. S.; Martin, K. E.; Gray, S. M.; Haynes, B. C.; Hall, B.
Purpose: Carrier screening for hereditary conditions is challenged by genes with complex genomic architecture, where short-read sequencing can fail to detect clinically relevant variants. This study evaluated a unified, amplification-based nanopore sequencing workflow across multiple laboratories for comprehensive analysis of such loci. Methods: A modular long-read sequencing assay was evaluated across five laboratories using targeted PCR enrichment, Oxford Nanopore sequencing, and automated variant analysis. The workflow interrogated genes associated with spinal muscular atrophy, thalassemia, cystic fibrosis, fragile X syndrome, congenital adrenal hyperplasia, Gaucher disease, and hemophilia A. Performance was assessed against orthogonal methods for single nucleotide variants (SNVs), indels, copy-number variants, repeat expansions, and structural rearrangements. Results: Across 882 unique samples (1,266 tests), overall agreement with comparator methods exceeded 96% for variant-level detection and 97% for genotype status classification. Long-read sequencing enabled phasing of paralogous loci, integrated sizing and interruption analysis for FMR1 repeats, and simultaneous detection of SNVs and structural variants in globin loci and the CYP21A2-TNXB region, reducing reliance on multiple workflows. Conclusion: This multisite evaluation suggests that targeted long-read sequencing can consolidate complex variant detection into a single workflow, improving analytical completeness and operational efficiency for carrier screening.
Bougiouri, K.; Irving-Pease, E. K.; Frantz, L. A. F.; Racimo, F.; Petr, M.
Recent advances in genome imputation have enabled the application of state-of-the-art statistical methods, originally developed for present-day genomes, to ancient genomes. One class of such methods, known as local ancestry inference (LAI), can model an individual's genome as a mosaic of tracts assigned to different putative ancestral sources, revealing patterns of genetic ancestry across the genome. However, most LAI methods have been designed to study recent admixture events in human history, and they generally assume large panels of present-day genomes. Despite the recent availability of high-quality imputed ancient genomes, it remains unknown to what degree LAI is reliable for such datasets. Ancient DNA is often characterized by heterogeneous geographic and temporal sampling, varying degrees of divergence between ancient source proxies and admixing populations, and complex demographic histories. Here, we performed an extensive set of population genetic simulations to evaluate the accuracy of four popular LAI methods (RFMix, FLARE, MOSAIC and simpLAI) under different demographic scenarios, various temporal sampling schemes, sample sizes, and admixture dates. We quantify the accuracy of these methods as a function of different parameters in practically relevant scenarios, and provide general guidelines for future studies utilizing LAI in ancient DNA research.
Tsutaya, T.; Hattori, T.; Onishi, R.; Budd, C. E.; Minoshima, H.; Takahashi, T.; Hirasawa, Y.; Chiku, S.; Omori, T.; Yamazaki, K.; Yoneda, M.; Kubo, D.; Ishida, H.; Sato, T.; Schulting, R. J.; Kato, H.; Weber, A. W.
Invasive species pose a major threat to biodiversity, yet our understanding of failed invasions/translocations, instances where alien/introduced species fail to establish, remains limited. Investigating the factors behind failed invasions is critical for improving prevention and management strategies for modern biological invasions. Here, we propose a novel framework that utilizes archaeological archives to uncover evidence of failed invasions. We estimated the biological and ecological factors contributing to the failed invasion of pigs from later prehistory to recent times (299 cal BC to 1900 AD) on Rebun Island in far northern Japan by synthesizing the evidence obtained from stable isotopes, zooarchaeology, and historical documents. Despite the anthropogenic introductions of pigs into Rebun Island, pigs did not establish a feral population and disappeared after ca. 1200 AD. We identified reduced propagule pressure, abiotic resistance due to the cold climate, and decreased resources as the three key factors that contributed to the disappearance of pigs. Pigs are one of the most widespread invasive species globally, and this study represents a novel approach to studying failed invasions using archaeological data, which aligns with the framework of conservation paleobiology.
Rodriguez, L. K.; Schallhart, S.; Hobmeier, P.; Curran, T.; Perez-Jorge, S.; Prieto, R.; Oliveira, C.; Silva, M. A.; Thalinger, B.
Environmental DNA (eDNA) analyses have become a powerful tool for non-invasive biodiversity monitoring, yet the applicability of population genetic approaches to environmental samples remains largely unexplored. Even when genetic traces originate from a single individual, low target DNA concentrations and amplification or sequencing artefacts can compromise downstream genetic inferences. Here, we present a novel approach for obtaining demographic insights and lineage-level mitogenomic information from aquatic eDNA samples collected near vertebrate individuals. Paired eDNA and tissue samples were collected during sperm whale (Physeter macrocephalus) encounters in the Azores. Samples were screened for the presence of vertebrate eDNA and analyzed with a novel molecular sex identification assay. Additionally, long-range PCR was used to amplify up to five mitochondrial DNA fragments (~3-4 kb) before subsequent sequencing on an Oxford Nanopore Technologies platform. A stringent three-tier filtering framework capable of identifying true mitogenomic variation across eDNA samples was developed for maximum recovery of genetic diversity at the haplogroup level. By benchmarking eDNA samples via their paired tissues, parameter values were optimized to maximize concordance and minimize spurious variant calls. Sexing was successful for 50% of eDNA samples, with 96% concordance to paired tissues, and marine vertebrate DNA concentration significantly predicted sexing success. Further, Medaka polishing produced high-identity mitochondrial consensus sequences (>16 kb) from eDNA samples. Across filtering regimes in the framework, curated SNP panels comprising up to 453 high-confidence mitochondrial SNPs resolved 19 haplogroups, with 93% concordance between eDNA and tissue samples. An intermediate bioinformatics filtering strategy maximized biologically accurate haplogroup recovery while minimizing sequencing artefacts, providing the most reliable lineage-level inferences. This integrative approach demonstrates that targeted nuclear assays combined with long-range mitochondrial sequencing can recover individual-level genetic information from aquatic eDNA. By defining analytical thresholds governing success, the framework advances non-invasive genetic monitoring of populations via eDNA and enables population-level monitoring and conservation of endangered and genetically vulnerable species.
Bravington, M. V.; Baylis, S. M.; Eveson, P.; Feutry, P.
Close-Kin Mark-Recapture (CKMR) is a statistical framework for estimating demographic parameters of wild populations. Instead of recapturing individuals, it relies on the identification of closely related pairs such as parents and offspring, or siblings. By measuring how often such close kin are "recaptured" among sampled animals (whether alive or dead), scientists can estimate demographic parameters such as census size, mortality rates, and connectivity. CKMR is starting to change fisheries and wildlife management by giving more reliable demographic information, even for many species that resist conventional approaches. Here we introduce the kinference R package, which provides a set of tools for finding close-kin pairs among thousands of samples each genotyped at thousands of SNPs, and for associated quality control. The CKMR context implies requirements and assumptions that differ from those of many other kinship programs. In particular, kinference accounts empirically for linkage without requiring a genome assembly, is able to estimate and control false-negative and false-positive probabilities, and can cope with null alleles. The package has been developed and used in numerous CKMR projects since 2017. This paper documents the assumptions, statistical algorithms, and intended workflow for kinference.
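The "recapture" logic can be illustrated with the simplest textbook CKMR case, parent-offspring pairs (POPs). Each juvenile has exactly two parents, so under idealised assumptions (a closed population, random sampling, equal reproductive output among adults) the chance that a given sampled adult is a parent of a given sampled juvenile is 2/N, where N is the number of adults. The sketch below applies that relationship. It is not part of kinference, which addresses the upstream task of finding the kin pairs, and the function name is hypothetical.

```python
def ckmr_adult_abundance(n_juveniles: int, n_adults: int, n_pops: int) -> float:
    """Moment estimate of adult abundance from parent-offspring pairs.

    Expected POPs = 2 * n_juveniles * n_adults / N, solved here for N.
    Assumes an idealised population; real CKMR models add age structure,
    mortality, and variable fecundity.
    """
    if n_pops <= 0:
        raise ValueError("at least one parent-offspring pair is required")
    return 2.0 * n_juveniles * n_adults / n_pops


# Illustrative numbers only (not from the paper):
# 1,000 juveniles compared against 500 adults, with 10 POPs found.
print(ckmr_adult_abundance(1000, 500, 10))
```

Fewer kin pairs for the same number of comparisons imply a larger population, which is why controlling false-negative and false-positive kin calls matters for the resulting estimates.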
Medina Tretmanis, J.; Avila-Arcos, M. C.; Jay, F.; Huerta-Sanchez, E.
Motivation: Local Ancestry Inference (LAI) allows us to study evolutionary processes in admixed populations[1], uncover ancestry-specific disease risk factors[2], and better understand the demographic history of these populations[3]. Many methods for LAI exist; however, these methods usually focus on cases of intercontinental admixture. In this work, we evaluate both existing and novel methods in challenging scenarios, such as downsampled reference panels, intracontinental admixture, and distant admixture events. Results: We present four novel LAI implementations based on neural network architectures, including Bidirectional Long Short-Term Memory and Transformer networks, which have not previously been used for LAI. We compare these novel implementations to existing methods for LAI across a variety of scenarios using the 1000 Genomes dataset and other synthetic datasets. We find that while all networks achieve high performance for intercontinental admixture scenarios, inference power is comparatively low for scenarios of intracontinental or distant admixture. We further show how our implementations achieve the best performance of all methods through specialized preprocessing and inference smoothing steps. Availability: All implementations and benchmarking code are available at https://github.com/Jazpy/LAINNs.
Matsunami, M.; Kawai, Y.; Speidel, L.; Koganebuchi, K.; Takigami, M.; Kakuda, T.; Adachi, N.; Kameda, Y.; Katagiri, C.; Shinzato, T.; Shinzato, A.; Takenaka, M.; Doi, N.; NCBN Controls WGS Consortium, ; Bird, N.; Hellenthal, G.; Yoneda, M.; Omori, T.; Ozaki, H.; Sakamoto, M.; Kinoshita, N.; Imamura, M.; Maeda, S.; Shinoda, K.-i.; Kanzawa-Kiriyama, H.; Kimura, R.
Characterized by the earliest use of pottery, the Jomon culture was a unique Neolithic culture that spread throughout the Japanese Archipelago. Previous archaeological evidence suggests that Jomon hunter-gatherers colonized the southernmost islands, the Ryukyu Archipelago, by approximately 7,000 years before present (YBP). However, genetic characteristics of the Ryukyu Jomon population and its contribution to the modern population have not been elucidated yet. In this study, we newly sequenced 273 modern and 25 ancient (6,700-900 YBP) whole genomes collected across the Ryukyu Archipelago. Our analysis demonstrated a genetic differentiation between the Hondo (Japanese mainland) and Ryukyu Jomon, dating back to ~6,900 YBP. After the divergence from the Hondo Jomon, the Ryukyu Jomon experienced severe bottlenecks, with an effective population size of ~2,000. Admixture between the Ryukyu Jomon and migrants from the historic Hondo population occurred ~1,000 YBP, which corresponds to the widespread adoption of iron tools and agriculture in the Central Ryukyus. Different demographic histories between modern Hondo and Ryukyu populations resulted in different rates of Jomon ancestry in these populations. By providing a new perspective on the peopling of the Ryukyu Archipelago, this study significantly enhances our understanding of cultural transitions in the region.
Kaur, R.; Dewan, C.; Chauhan, I.; Sharma, K.; Sharma, S.
Assessing reproducibility across different molecular profiling studies is a persistent methodological challenge (Zhang et al., 2009; Sweeney et al., 2017; Ioannidis, 2005). Differences in platform technology, cohort composition, analytical pipelines, and feature definitions often make it difficult to interpret cross-study comparisons based solely on gene-identity overlap. In this study, we conducted a retrospective computational analysis of seven publicly available analytical datasets (including alternative analytical pipelines applied to the same cohort) derived from five biologically independent peripheral blood transcriptomic and DNA methylation cohorts, comprising 3,487 samples (1,824 Parkinson's disease cases and 1,663 controls). Reproducibility was evaluated using gene-identity overlap, enrichment-based comparisons, and a permutation-based framework to assess directional consistency of effect estimates across datasets. We also tested the robustness of results by varying false discovery rate thresholds and applying alternative probe-to-gene collapsing strategies. All analyses were performed using reproducible workflows implemented in R and Python with fixed random seeds. Across independent cohorts, gene-identity overlap was generally limited, with enrichment ratios close to one, especially when datasets were generated using different platforms. In several datasets, limited numbers of statistically significant features further constrained overlap-based comparisons. In contrast, directional consistency showed greater stability. High levels of directional consistency were observed across independent cohort comparisons when restricted to overlapping statistically significant features and remained stable across statistical thresholds (90.0% at FDR < 0.05 and 82.8% at FDR < 0.10). When evaluated across the full shared gene universe without conditioning on statistical significance, directional consistency was substantially lower (~30 to 32%) but remained significantly above permutation-based null expectations. Permutation testing confirmed that the observed directional consistency exceeded what would be expected by chance. A combined analysis including methodological replicates (n ≥ 3 datasets) showed 98.3% directional consistency; however, this estimate includes non-independent analytical pipelines applied to the same cohort and reflects analytical stability rather than independent biological replication. Rather than introducing a new statistical method, this study examines how commonly used reproducibility metrics behave under cross-study heterogeneity and identifies their practical limitations and appropriate use boundaries.
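A permutation test of directional consistency can be sketched as follows. The abstract does not define the metric precisely, so this example assumes one plausible definition: the fraction of shared genes whose effect estimates have the same sign in two datasets, with the null distribution obtained by shuffling the signs of one dataset across genes. The function names, the treatment of zero effects as negative, and the toy data are all illustrative assumptions, and the authors' own definition may differ.

```python
import random
from typing import Dict, Tuple


def directional_consistency(effects_a: Dict[str, float], effects_b: Dict[str, float]) -> float:
    """Fraction of shared genes whose effects have the same sign in both datasets."""
    shared = [g for g in effects_a if g in effects_b]
    if not shared:
        raise ValueError("no shared genes")
    # Zero effects are treated as negative in this sketch.
    agree = sum((effects_a[g] > 0) == (effects_b[g] > 0) for g in shared)
    return agree / len(shared)


def permutation_p_value(effects_a: Dict[str, float], effects_b: Dict[str, float],
                        n_perm: int = 2000, seed: int = 0) -> Tuple[float, float]:
    """Observed consistency and a one-sided permutation p-value.

    The null is built by shuffling dataset B's signs across the shared genes.
    A fixed seed keeps the result reproducible.
    """
    rng = random.Random(seed)
    shared = sorted(g for g in effects_a if g in effects_b)
    observed = directional_consistency(effects_a, effects_b)
    signs_a = [effects_a[g] > 0 for g in shared]
    signs_b = [effects_b[g] > 0 for g in shared]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(signs_b)
        null = sum(a == b for a, b in zip(signs_a, signs_b)) / len(shared)
        if null >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy effect estimates (illustrative only, not data from the study).
study_a = {"g1": 1.0, "g2": -0.5, "g3": 0.2, "g4": -1.2}
study_b = {"g1": 0.3, "g2": -0.1, "g3": -0.4, "g5": 1.0}
print(permutation_p_value(study_a, study_b, n_perm=200))
```

Adding one to both the numerator and denominator of the p-value avoids reporting exactly zero when no permutation reaches the observed value.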
Ohyama, Y.; Shimamura, M.; Asami, Y.; Tourlousse, D. M.; Togawa, N.; Narita, K.; Hayashi, N.; Terauchi, J.; Sekiguchi, Y.; Kawasaki, H.; Miura, T.
Accurate quantification of fungi is important for a myriad of applications but remains challenging. Previously, we demonstrated that an approach called the adenine-HPLC method can quantify bacteria, including those with aggregating properties that are difficult to quantify using conventional methods, by measuring cellular adenine derived from DNA and converting the adenine amount to genome copy number, without being influenced by cell morphology. However, in this study, when this adenine-HPLC method was applied to the quantification of budding yeast as a model fungus, accurate measurement proved impossible. This limitation was attributed to adenine release from other adenine-containing biomolecules, such as RNA and ATP, and we therefore developed a method that suppresses adenine release from these molecules. This method involves reducing the temperature of the acid treatment and prewashing the cells before acid treatment. In addition, we incorporated a process that corrects for the naturally occurring free adenine level as background during total adenine measurement. The improved adenine-HPLC method based on these modifications enables accurate quantification of budding yeast using genomic DNA content in whole cells as the quantification unit.
Heckman, C. A.
Background: High-content assays (HCAs) have problems distinguishing biologically significant effects from the incidental effects of non-repeatable technical factors. Non-repeatable results are attributed to variations in the cell culture environment and the numerous, heterogeneous descriptors evaluated. The aim here was to determine whether preprocessing operations impacted the reproducibility of class assignments of experimental data. Methods: Batch effects that could affect reproducibility, i.e., signal/noise ratio, instrumental conditions, and segmentation, were controlled variables. The remaining batch effects, namely variations in materials, personnel, and culture environment, could not be controlled. Descriptor values were measured directly from images. Exploratory factor analysis was used to solve the identifiable and interpretable feature, factor 4. In each of five trials, one sample was treated with the same chemical mixture (EXP) and another with the solvent vehicle alone (CON). Results: Repeated CON and EXP samples showed significant differences among factor 4 means in data regularized within each trial. The mean of Trial 3 CON differed significantly from all other CON samples. These differences disappeared upon regularization to comprehensive databases. Among repeated EXPs, the Trial 2 mean differed from three other EXPs, but regularization to comprehensive databases had little effect. However, classification patterns were unchanged after regularization to any comprehensive database derived by the same protocol. After regularization to datasets derived by two different protocols, the classification pattern differed but only reflected elevation of differences that had been marginal to statistical significance. Outlier removal was deleterious. Even with the most sparing definition of outliers, over 3% of a single sample's contents were removed from most trials. Elimination based on the overall within-trial distributions caused type I and type II errors. Conclusions: Non-repeatable factor 4 means in repeated trials had negligible influence on classification outcomes, so repeatability may not be a good indicator of assay quality. Irreducible batch effects, combined with small sample sizes and skewed distributions of descriptor values, may account for non-repeatability. As the current results are based on real-world data, they suggest that non-repeatability is an uncorrectable feature of these assays. Classification patterns are not affected by several irreducible technical factors, namely materials, personnel, and non-repeatable environmental variables.
Smith, C.; Peter Durairaj, R. R.; Randall, E. L.; Aston, A. N.; Heraty, L.; Elsayed, W.; Murillo, A.; Dion, V.
The expansion of short tandem repeats is a feature of over 60 different human diseases. Ongoing somatic instability throughout a patient's lifetime can influence disease progression and has emerged as a therapeutic target. Understanding its mechanism is essential for the identification of both drug targets and therapeutic interventions. A major obstacle towards this translational goal has been measuring changes in repeat size distribution in a timely manner. To address this, here we present the Single Clone-based Instability Assay (SCIA), a streamlined experimental design that saves weeks in assessing the effect of a gene knockout on repeat instability. The approach avoids bulk cultures and does not require a reporter cell line. It uses targeted long-read sequencing as a readout for repeat instability. We have validated the approach using FAN1, PMS1, and MLH1 knockouts in HEK293-derived cells. We provide visualization software that generates delta plots and extracts the instability frequency, the bias towards expansion or contraction, and the average size of the changes. Using SCIA, we find that although FAN1 knockout clones showed an increased frequency of expansions, the expansions were smaller. This highlights the wealth of information that can be extracted and the potential for novel insights into the mechanism of repeat instability.